The List-Decoding Size of Fourier-Sparse Boolean Functions

Abstract

A function defined on the Boolean hypercube is k-Fourier-sparse if it has at most k nonzero
Fourier coefficients. For a function $f : \mathbb{F}_2^n \to \mathbb{R}$ and parameters k and d, we prove a strong
upper bound on the number of k-Fourier-sparse Boolean functions that disagree with f on
at most d inputs. Our bound implies that the number of uniform and independent random
samples needed for learning the class of k-Fourier-sparse Boolean functions on n variables
exactly is at most $O(n \cdot k \log k)$.

As an application, we prove an upper bound on the query complexity of testing Booleanity 
of Fourier-sparse functions. Our bound is tight up to a logarithmic factor and quadratically 
improves on a result due to Gur and Tamuz (Chicago J. Theor. Comput. Sci., 2013). 


1 Introduction 

Functions defined on the Boolean hypercube $\{0,1\}^n = \mathbb{F}_2^n$ are fundamental objects in theoretical
computer science. It is well known that every such function $f : \mathbb{F}_2^n \to \mathbb{R}$ can be represented as a
linear combination

$$f = \sum_{S \subseteq [n]} \hat{f}(S) \cdot \chi_S$$

of the $2^n$ functions $\{\chi_S\}_{S \subseteq [n]}$ defined by $\chi_S(x) = (-1)^{\sum_{i \in S} x_i}$. This representation is known as
the Fourier expansion of the function f, and the numbers $\hat{f}(S)$ are known as its Fourier coefficients.
The Fourier expansion of functions plays a central role in analysis of Boolean functions and finds
applications in numerous areas of theoretical computer science including learning theory, property
testing, hardness of approximation, social choice theory, and cryptography. For an in-depth
introduction to the topic the reader is referred to the book of O'Donnell [22].
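To make the definition concrete, the following Python sketch computes the Fourier coefficients of a function on a small hypercube by brute force and counts the nonzero ones. It is an illustration only: the helper names `fourier_coefficients`, `fourier_sparsity`, `and_all`, and `parity` are ours and do not appear in the paper, and the exact comparison with zero is only safe here because the examples are integer-valued.

```python
from itertools import product

def fourier_coefficients(f, n):
    """Map each subset S of [n] (encoded as an n-bit mask) to E_x[f(x) * chi_S(x)]."""
    points = list(product((0, 1), repeat=n))
    coeffs = {}
    for mask in range(2 ** n):
        total = 0
        for x in points:
            # chi_S(x) = (-1)^(sum of x_i over i in S)
            bit = sum(x[i] for i in range(n) if (mask >> i) & 1) % 2
            total += f(x) * (-1) ** bit
        coeffs[mask] = total / 2 ** n
    return coeffs

def fourier_sparsity(f, n):
    """Number of nonzero Fourier coefficients of f."""
    return sum(1 for c in fourier_coefficients(f, n).values() if c != 0)

def and_all(x):
    """AND of all input bits, as a {0,1}-valued function."""
    return int(all(x))

def parity(x):
    """XOR of all input bits, as a {0,1}-valued function."""
    return sum(x) % 2

print(fourier_sparsity(and_all, 3))  # AND on 3 bits has full Fourier support
print(fourier_sparsity(parity, 3))   # parity equals (1 - chi_[n]) / 2
```

The contrast between the two examples is the point: parity is 2-Fourier-sparse for every n, whereas the AND of n bits has all $2^n$ coefficients nonzero.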

A classical result in learning theory is a general algorithm due to Kushilevitz and Mansour [19],
based on results of Linial, Mansour, and Nisan [20] and Goldreich and Levin [12], which makes it
possible to efficiently learn classes of Boolean functions with a "simple" Fourier expansion. A common
notion of simplicity of Fourier expansion is its sparsity. A function is said to be k-Fourier-sparse
if it has at most k nonzero Fourier coefficients. It follows from [19] that given query access to a
k-Fourier-sparse Boolean function $f : \mathbb{F}_2^n \to \{0,1\}$ it is possible to estimate its Fourier coefficients
and to get a good approximation of f in running time polynomial in n and k. Later, it was shown
that such running time even suffices to reconstruct the function f exactly [13].

In recent years, properties of the Fourier expansion of functions were studied in the property
testing framework. We now mention some of those results; since this will not be needed for the
sequel, the reader can skip directly to the description of our results in the next section. Gopalan,
O'Donnell, Servedio, Shpilka, and Wimmer considered in [13] the problem of testing if a given
Boolean function is k-Fourier-sparse or $\epsilon$-far from any such function. Another problem studied
there is that of deciding if a function is k-Fourier-dimensional, that is, the Fourier support, viewed
as a subset of $\mathbb{F}_2^n$, spans a subspace of dimension at most k, or $\epsilon$-far from satisfying this property.
Gopalan et al. [13] established testers for these properties whose query complexities depend only
on k and $\epsilon$. For k-Fourier-sparsity the query complexity was a certain polynomial in k and $1/\epsilon$
and for k-Fourier-dimensionality it was $O(k \cdot 2^{2k}/\epsilon)$. They also proved lower bounds of $\Omega(\sqrt{k})$
and $\Omega(2^{k/2})$ respectively. Another parameter associated with a Boolean function is the degree of
its representation as a polynomial over $\mathbb{F}_2$. The algorithmic task of testing if a function has
$\mathbb{F}_2$-degree at most d or is $\epsilon$-far from any such function was considered by Alon et al. [1] and then by
Bhattacharyya et al. HI, who proved tight upper and lower bounds of $\Theta(2^d + 1/\epsilon)$ on the query
complexity. Note that all the above properties fall into the class of linear-invariant properties,
i.e., properties that are closed under compositions with any invertible linear transformation of the
domain. These properties have recently attracted a significant amount of attention in the attempt
to characterize which of them are efficiently testable (see [24, 5] for related surveys).

1.1 Our Results 

List-decoding size. Our main technical result, from which we derive all other results, is concerned
with the list-decoding size of Fourier-sparse Boolean functions. In general, the list-decoding
problem of an error correcting code for a distance parameter d asks to find all the codewords whose
Hamming distance from a given word is at most d. Here we consider the (non-linear) binary
code of block length $2^n$ whose codewords represent all the k-Fourier-sparse Boolean functions on
n variables.

It is not difficult to show that the total number of such functions is at most $2^{O(nk)}$. Indeed,
there are at most $\binom{2^n}{k}$ ways to choose the support of $\hat{f}$, and at most $(2^{n+1}+1)^k$ ways to set those
Fourier coefficients, which must all be integer multiples of $2^{-n}$ in $[-1,+1]$. It is also not difficult
to show that the distance between any two distinct codewords is at least $2^n/k$. Indeed, it is known
that every k-Fourier-sparse Boolean function has $\mathbb{F}_2$-degree $d \le \log_2 k$ (see, e.g., (H Lemma 3]), and
therefore, by the Schwartz-Zippel lemma, every two distinct k-Fourier-sparse Boolean functions
disagree on at least a $1/k$ fraction of the inputs. As a result, for every function $f : \mathbb{F}_2^n \to \mathbb{R}$ there is
at most one codeword of distance smaller than $2^n/(2k)$ from f.

We are not aware of any other known bounds beyond those two naive ones. We address this
question in the following theorem.

Theorem 1.1. For every function $f : \mathbb{F}_2^n \to \mathbb{R}$, the number of k-Fourier-sparse Boolean functions of
distance at most d from f is $2^{O(n \cdot dk \log k/2^n)}$.

We observe that for certain choices of k and d the bound given in Theorem 1.1 is tight. For
example, let f be the constant zero function, let $k \le 2^{0.9n}$ be a power of 2, and take $d = 2^n/k$.
Consider all the indicator functions of linear subspaces of $\mathbb{F}_2^n$ of co-dimension $\log_2 k$. Every such
function is of distance d from f and is k-Fourier-sparse (see Claim 2.4). The number of such
functions is $2^{\Omega(n \log k)}$.
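The tightness example can be checked by brute force on a tiny instance. The sketch below is our own illustration with hypothetical helper names; it uses n = 4 and a linear subspace of co-dimension 2 (so k = 4 and d = 4), cut out by two linearly independent parity constraints chosen arbitrarily.

```python
from itertools import product

def subspace_indicator(constraints):
    """Indicator of {x : <a, x> = 0 (mod 2) for every a in constraints}."""
    def indicator(x):
        return int(all(sum(ai * xi for ai, xi in zip(a, x)) % 2 == 0 for a in constraints))
    return indicator

def fourier_sparsity(f, n):
    """Count the sets S (as characteristic vectors s) with a nonzero Fourier coefficient."""
    points = list(product((0, 1), repeat=n))
    count = 0
    for s in points:
        # 2^n times the Fourier coefficient; an integer for integer-valued f
        total = sum(f(x) * (-1) ** (sum(si * xi for si, xi in zip(s, x)) % 2) for x in points)
        if total != 0:
            count += 1
    return count

n = 4
constraints = [(1, 0, 1, 0), (0, 1, 1, 1)]  # linearly independent, so co-dimension 2 and k = 4
g = subspace_indicator(constraints)

# Distance from the constant zero function is the support size, expected to be 2^n / k.
support_size = sum(g(x) for x in product((0, 1), repeat=n))
sparsity = fourier_sparsity(g, n)
```

Here the nonzero coefficients sit exactly on the span of the constraint vectors, which is why the sparsity equals $2^{\text{co-dimension}}$.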


Learning from samples. As an application of the list-decoding bound, we next consider the
problem of learning the class of k-Fourier-sparse Boolean functions on n variables (exactly) from
uniform and independent random samples (see, e.g., [2, 18] for related work). Let us note already
at the outset that all the results mentioned here are not efficient: it is not known if there is an
algorithm for the problem whose running time is some fixed polynomial in n times an arbitrary
function of k. Among other things, such an algorithm would imply a breakthrough on the
long-standing open question of learning juntas from samples Il7l [^l25l[l8l .

The question of recovering a function that is sparse in the Fourier (or other) basis from a few
samples is the central question in the area of sparse recovery. It has been intensely investigated
for over a decade and, among other things, has applications for compressed sensing and for the
data stream model. The best previously known bounds on our question are $O(n \cdot k \log^3 k) \le O(n^4 \cdot k)$
due to Cheraghchi, Guruswami, and Velingker [11] and $O(n^2 \cdot k \log k) \le O(n^3 \cdot k)$ due
to Bourgain [8], improving on a previous bound of Rudelson and Vershynin [23] (who themselves
improved on the work of Candes and Tao [10]). We note in passing that they actually answer a
harder question: first, because they handle all functions, not necessarily Boolean-valued; second,
because they show that a randomly chosen set of sample locations of the above cardinality is
good with high probability simultaneously for all k-Fourier-sparse functions (sometimes known
as the "deterministic" setting), whereas we only want a random set of sample locations to be good
with high probability for any fixed k-Fourier-sparse function (the "randomized" setting); finally,
because they obtain the recovery result by proving a "restricted isometry property" of the Fourier
matrix which among other things implies a recovery algorithm running in time polynomial in $2^n$
and k.

Using Theorem 1.1, we improve the upper bound on the sample complexity of learning
Fourier-sparse Boolean functions.

Corollary 1.2. The number of uniform and independent random samples required for learning the class of
k-Fourier-sparse Boolean functions on n variables is $O(n \cdot k \log k)$.

We believe that our better bound and its elementary proof shed more light on the problem and
might be useful elsewhere. In fact, in a follow-up work [15] we employ the techniques developed
here to study the "restricted isometry property" of random submatrices of Fourier (and other)
matrices, improving on the aforementioned works [11, 8]. We finally note that a lower bound
of $\Omega(k \cdot (n - \log_2 k))$ on the sample complexity can be obtained by considering the problem of
learning indicator functions of affine subspaces of $\mathbb{F}_2^n$ of co-dimension $\log_2 k$ (see Theorem 3.7; see,
e.g., EJ for the same lower bound in a different setting).

Testing Booleanity. We next consider the problem of testing Booleanity of Fourier-sparse
functions, which was introduced and studied by Gur and Tamuz in [14]. In this problem, given access
to a k-Fourier-sparse function $f : \mathbb{F}_2^n \to \mathbb{R}$, one has to decide if f is Boolean, i.e., its image is
contained in {0,1}, or not. The objective is to distinguish between the two cases with some constant
probability using as few queries to f as possible. It was shown in [14] that there exists a
(non-adaptive one-sided error) tester for the problem with query complexity $O(k^2)$, and that every
tester for the problem has query complexity $\Omega(k)$. Here, we use our result on learning
k-Fourier-sparse Boolean functions to improve the upper bound of [14] and prove the following.

Theorem 1.3. For every k there exists a non-adaptive one-sided error tester that using $O(k \cdot \log^2 k)$ queries
to an input k-Fourier-sparse function $f : \mathbb{F}_2^n \to \mathbb{R}$ decides if f is Boolean or not with constant success
probability.

We note that, while the tester established in Theorem 1.3 has an improved query complexity,
it is not clear if it is efficient with respect to running time. It can be shown, though, that using the
learning algorithm of Fourier-sparse functions that follows from [8, 15] (instead of Corollary 1.2)
in our proof of Theorem 1.3 one can obtain an efficient algorithm (running in time polynomial in
n and k) with the slightly worse query complexity of $O(k \cdot \log^3 k)$.

Finally, we complement Theorem 1.3 by the following nearly matching lower bound.

Theorem 1.4. Every non-adaptive one-sided error tester for Booleanity of k-Fourier-sparse functions has
query complexity $\Omega(k \cdot \log k)$.

1.2 Overview of Proofs 

1.2.1 The List-Decoding Size of Fourier-Sparse Boolean Functions 

In order to prove Theorem 1.1 we have to bound from above the number of k-Fourier-sparse
Boolean functions of distance at most d from a general function $f : \mathbb{F}_2^n \to \mathbb{R}$. In the discussion
below, let us consider the special case where f is the constant zero function. The general result
follows easily.

Here, we have to bound the number of k-Fourier-sparse Boolean functions $g : \mathbb{F}_2^n \to \{0,1\}$
of support size at most d. We start by observing using Parseval's theorem that such functions
have small spectral norm $\|\hat{g}\|_1 = \sum_{S \subseteq [n]} |\hat{g}(S)|$. Next, we observe that the Fourier expansion
of the normalized function $g/\|\hat{g}\|_1$ is a convex combination of the functions $\pm\chi_S$, and thus can be
viewed, following a technique of Bruck and Smolensky [9], as an expectation over a distribution
on the S's. Using the Chernoff-Hoeffding bound and the bound on the spectral norm, we obtain
a succinct representation for every such function g. The ability to represent these functions by a
binary string of bounded length yields the upper bound on their number. We note that the proof
approach somewhat resembles that of the upper bound on the list-decoding size of Reed-Muller
codes due to Kaufman, Lovett, and Porat [17].

1.2.2 Learning Fourier-Sparse Boolean Functions

As a warmup, let us mention an easy upper bound of $O(n \cdot k^2)$. This follows by recalling that
there are at most $2^{O(nk)}$ k-Fourier-sparse Boolean functions, and that each one differs from any
fixed function on at least a $1/k$ fraction of the inputs. Hence by the union bound, after $O(n \cdot k^2)$
samples all other functions will be eliminated.
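The warmup calculation can be spelled out as follows. This is a sketch in which $c$ denotes the unspecified absolute constant hidden in the $2^{O(nk)}$ count; neither $c$ nor the explicit threshold on $q$ is named in the text. A fixed wrong candidate survives a single uniform sample with probability at most $1 - 1/k$, so for $q$ independent samples the union bound gives

```latex
\[
  \Pr[\text{some wrong candidate survives all } q \text{ samples}]
  \;\le\; 2^{c \cdot nk} \cdot \Bigl(1 - \frac{1}{k}\Bigr)^{q}
  \;\le\; 2^{c \cdot nk} \cdot e^{-q/k}
  \;\le\; 2^{-nk}
  \qquad \text{whenever } q \ge (c+1) \ln 2 \cdot n k^{2}.
\]
```

The last step only uses $e^{-q/k} \le 2^{-(c+1) nk}$, which is equivalent to the stated condition on $q$.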

The improved bound in Corollary 1.2 follows similarly using the list-decoding result of
Theorem 1.1. Namely, we apply the union bound separately on functions of different distances from
the input function. Functions that are nearby are harder to "hit" using random samples, but by
the theorem, there are few of them; functions that are further away are in abundance, but they are
easier to "hit" using random samples.
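The elimination argument can be simulated on a toy instance. The sketch below is our own illustration, not an algorithm from the paper: it enumerates every k-Fourier-sparse Boolean function on n = 3 variables by brute force (feasible only for tiny n), draws uniform samples of a target, and keeps the candidates that agree with all of them. Truth tables are indexed by integers in $[0, 2^n)$, and the names `sparsity`, `sparse_boolean_class`, and `consistent` are hypothetical.

```python
import random
from itertools import product

def sparsity(table, n):
    """Number of nonzero Fourier coefficients of a truth table indexed by 0..2^n-1."""
    count = 0
    for s in range(2 ** n):
        # 2^n times the coefficient of chi_S, where S is encoded by the bitmask s
        total = sum(v * (-1) ** bin(s & x).count("1") for x, v in enumerate(table))
        count += total != 0
    return count

def sparse_boolean_class(n, k):
    """All Boolean truth tables on n variables with at most k nonzero coefficients."""
    return [t for t in product((0, 1), repeat=2 ** n) if sparsity(t, n) <= k]

def consistent(hypotheses, samples):
    """Candidates that agree with every labelled sample (x, y)."""
    return [h for h in hypotheses if all(h[x] == y for x, y in samples)]

n, k = 3, 4
hypotheses = sparse_boolean_class(n, k)
rng = random.Random(0)
target = rng.choice(hypotheses)
samples = [(x, target[x]) for x in (rng.randrange(2 ** n) for _ in range(12))]
survivors = consistent(hypotheses, samples)
print(len(hypotheses), len(survivors))
```

The target always survives; the content of Corollary 1.2 is that with $O(n \cdot k \log k)$ samples it is, with high probability, the only survivor.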

1.2.3 Testing Booleanity of Fourier-Sparse Functions 

The testing Booleanity problem is somewhat different from typical property testing problems.
Indeed, in property testing one usually has to distinguish objects that satisfy a certain property
from those that are $\epsilon$-far from the property for some distance parameter $\epsilon > 0$. However, here the
tester is required to decide if the function satisfies the Booleanity property or not, with no distance
parameter involved. This unusual setting makes sense in this case because Fourier-sparse
non-Boolean functions are always quite far from every Boolean function. More precisely, the authors
of [14] used the uncertainty principle (see Proposition 2.1) to prove that every k-Fourier-sparse
non-Boolean function $f : \mathbb{F}_2^n \to \mathbb{R}$ is non-Boolean on at least $\Omega(2^n/k^2)$ inputs (see Claim 2.3). This
immediately implies a (non-adaptive one-sided error) tester that uses $O(k^2)$ queries: just check
that f is Boolean on $O(k^2)$ uniform inputs in $\mathbb{F}_2^n$.

The analysis of [14] turns out to be tight, as there are k-Fourier-sparse non-Boolean functions
that are not Boolean at only $O(2^n/k^2)$ points. Indeed, for an even integer n, consider the function
$f : \mathbb{F}_2^n \to \{0,1,2\}$ defined by

$$f(x_1, \ldots, x_n) = \mathrm{AND}(x_1, \ldots, x_{n/2}) + \mathrm{AND}(x_{n/2+1}, \ldots, x_n), \qquad (1)$$

which is not Boolean at only one point and has Fourier-sparsity at most $2 \cdot 2^{n/2}$ (see Claim 2.4).
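A brute-force check of this example for n = 4 is sketched below. The code is our own illustration (the helper `fourier_sparsity` is a hypothetical name); it counts the points where the sum of the two ANDs leaves {0,1} and the number of nonzero Fourier coefficients.

```python
from itertools import product

def f(x):
    """AND of the first half of the bits plus AND of the second half."""
    half = len(x) // 2
    return int(all(x[:half])) + int(all(x[half:]))

def fourier_sparsity(func, n):
    """Count the characteristic vectors s whose Fourier coefficient is nonzero."""
    points = list(product((0, 1), repeat=n))
    return sum(
        1 for s in points
        if sum(func(x) * (-1) ** (sum(si * xi for si, xi in zip(s, x)) % 2) for x in points) != 0
    )

n = 4
non_boolean_points = [x for x in product((0, 1), repeat=n) if f(x) not in (0, 1)]
k = fourier_sparsity(f, n)
print(non_boolean_points, k)
```

For n = 4 the two ANDs each contribute $2^{n/2} = 4$ coefficients and share only the empty set, so the sparsity is one less than $2 \cdot 2^{n/2}$.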

Upper bound. We prove Theorem 1.3 using our learning result, Corollary 1.2. To do so, we
first observe that a restriction of a k-Fourier-sparse non-Boolean function to a random subspace of
dimension $O(\log k)$ is non-Boolean with high probability (see Lemma 4.1). Since a restriction to a
subspace does not increase the Fourier-sparsity, this reduces our problem to testing Booleanity of
k-Fourier-sparse functions on $n = O(\log k)$ variables. Then, after $O(k \cdot \log^2 k)$ samples from the
subspace, if a non-Boolean value was found then we are clearly done. Otherwise, by Corollary 1.2,
the samples uniquely specify a Boolean candidate for the restricted function. Such a function
must be quite far from every other k-Fourier-sparse function (Boolean or not; see Claim 2.2). This
enables us to decide if the restricted function equals the Boolean candidate function or not.

Lower bound. The upper bound in Theorem 1.3 gets close to the $\Omega(k)$ lower bound proven by
Gur and Tamuz in [14]. For their lower bound, they considered the following two distributions:
(a) the uniform distribution over all Boolean n-variable functions that depend only on their first
$\log_2 k$ variables; (b) the uniform distribution over all n-variable functions that depend only on
their first $\log_2 k$ variables and return a Boolean value on $k - 1$ of the assignments to the relevant
variables and the value 2 otherwise. It can be easily seen that any (possibly adaptive) tester that
distinguishes with some constant probability between distributions (a) and (b) has query
complexity $\Omega(k)$. Since the first distribution is supported on k-Fourier-sparse Boolean functions and
the second on k-Fourier-sparse non-Boolean functions, this implies that the same lower bound
holds for the query complexity of testing Booleanity of k-Fourier-sparse functions.

Note that the distributions considered above are supported on $\log_2 k$-Fourier-dimensional
functions. It can be seen (say, using the uncertainty principle) that such functions are not Boolean on at
least a $1/k$ fraction of their inputs, so $O(k)$ random samples suffice for finding a non-Boolean value
if one exists. Hence, in order to get beyond the $\Omega(k)$ lower bound, we need to consider
k-Fourier-sparse functions that are not Boolean at only an $o(1/k)$ fraction of the inputs; our functions will
actually have an $O(1/k^2)$ fraction of such inputs.

Specifically, we consider the distribution of functions obtained by composing the function f
given in (1) with a random invertible affine transformation. This is the class of functions that can
be represented as a sum of two indicators of affine subspaces $V_1, V_2 \subseteq \mathbb{F}_2^n$ of dimension
$n/2$, which intersect at exactly one point. Intuitively, it seems that distinguishing the functions in
this class from those where $V_1$ and $V_2$ have empty intersection requires the tester to learn the affine
subspaces $V_1$ and $V_2$, a task that requires $\Omega(n \cdot 2^{n/2})$ queries. We prove such a lower bound for
non-adaptive one-sided error testers. Since the above functions are k-Fourier-sparse for $k = O(2^{n/2})$,
the obtained lower bound is $\Omega(k \cdot \log k)$.

2 Preliminaries 

Let [n] denote the set {1,..., n}. A function $f : \mathbb{F}_2^n \to \mathbb{R}$ is Boolean if its image is contained in
{0,1} and is non-Boolean otherwise. The distance between two functions $f, g : \mathbb{F}_2^n \to \mathbb{R}$, denoted
dist(f, g), is the number of vectors $x \in \mathbb{F}_2^n$ for which $f(x) \ne g(x)$.

Fourier Expansion 

For every $S \subseteq [n]$, let $\chi_S : \mathbb{F}_2^n \to \{-1,+1\}$ denote the function defined by $\chi_S(x) = (-1)^{\sum_{i \in S} x_i}$.
It is well known that the $2^n$ functions $\{\chi_S\}_{S \subseteq [n]}$ form an orthonormal basis of the space of
functions $\mathbb{F}_2^n \to \mathbb{R}$ with respect to the inner product $\langle f, g \rangle = \mathbb{E}_x[f(x) \cdot g(x)]$, where x is distributed
uniformly over $\mathbb{F}_2^n$. Thus, every function $f : \mathbb{F}_2^n \to \mathbb{R}$ can be uniquely represented as a linear
combination $f = \sum_{S \subseteq [n]} \hat{f}(S) \cdot \chi_S$ of this basis. This representation is called the Fourier expansion
of f, and the numbers $\hat{f}(S)$ are referred to as its Fourier coefficients. The support of f is defined
by $\mathrm{supp}(f) = \{x \in \mathbb{F}_2^n \mid f(x) \ne 0\}$ and the support of $\hat{f}$, known as the Fourier spectrum of f, by
$\mathrm{supp}(\hat{f}) = \{S \subseteq [n] \mid \hat{f}(S) \ne 0\}$. We say that f is k-Fourier-sparse^1 if $|\mathrm{supp}(\hat{f})| \le k$. For every
$p \ge 1$ we denote $\|\hat{f}\|_p = (\sum_{S \subseteq [n]} |\hat{f}(S)|^p)^{1/p}$. For p = 1, $\|\hat{f}\|_1$ is known as the spectral norm of f.
Parseval's theorem states that $\mathbb{E}_x[f(x)^2] = \|\hat{f}\|_2^2$.

^1 Boolean functions are sometimes defined in the literature with range {-1,+1} rather than {0,1}. Notice that this
affects the Fourier-sparsity by at most 1.

The uncertainty principle says that there is no nonzero function f for which the supports of
both f and $\hat{f}$ are small (see, e.g., [22, Exercise 3.15]). We state it below with two simple
consequences.

Proposition 2.1 (The Uncertainty Principle). For every nonzero function $f : \mathbb{F}_2^n \to \mathbb{R}$,

$$|\mathrm{supp}(f)| \cdot |\mathrm{supp}(\hat{f})| \ge 2^n.$$

Claim 2.2. For every two distinct k-Fourier-sparse functions $f, g : \mathbb{F}_2^n \to \mathbb{R}$, $\mathrm{dist}(f, g) \ge 2^n/(2k)$.

Proof: Apply Proposition 2.1 to the function $f - g$, whose Fourier-sparsity is at most 2k. ■

Claim 2.3 ([14]). For every k-Fourier-sparse function $f : \mathbb{F}_2^n \to \mathbb{R}$, if f is non-Boolean then

$$|\{x \in \mathbb{F}_2^n \mid f(x) \notin \{0,1\}\}| \ge \frac{2}{k^2 + k + 2} \cdot 2^n.$$

Proof: Apply Proposition 2.1 to the function $f \cdot (f - 1)$, whose Fourier-sparsity is at most

$$|\{S \triangle T \mid S, T \in \mathrm{supp}(\hat{f})\}| + |\mathrm{supp}(\hat{f})| \le \binom{k}{2} + k + 1 = \frac{k^2 + k + 2}{2},$$

where $\triangle$ stands for symmetric difference of sets. ■

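We also need the following simple claim.

Claim 2.4. For every affine subspace $V \subseteq \mathbb{F}_2^n$ of co-dimension k, the indicator function $1_V : \mathbb{F}_2^n \to \{0,1\}$
is $2^k$-Fourier-sparse.

Proof: Since V has co-dimension k, there exist $a_1, \ldots, a_k \in \mathbb{F}_2^n$ and $b_1, \ldots, b_k \in \mathbb{F}_2$ such that
$V = \{x \in \mathbb{F}_2^n \mid \langle a_i, x \rangle = b_i,\ i = 1, \ldots, k\}$. For every i, let $S_i \subseteq [n]$ denote the set whose characteristic
vector is $a_i$, and observe that for every $x \in \mathbb{F}_2^n$,

$$1_V(x) = \prod_{i=1}^{k} \frac{1 + (-1)^{b_i} \cdot \chi_{S_i}(x)}{2}.$$

This representation implies that $1_V$ is $2^k$-Fourier-sparse.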


Chernoff-Hoeffding Bound

Theorem 2.5. Let $X_1, \ldots, X_N$ be N identically distributed independent random variables in $[-a, +a]$
satisfying $\mathbb{E}[X_i] = \mu$ for all i. Then for every $\epsilon > 0$, $\delta \le 1/2$, and $N \ge C \cdot a^2 \cdot \log(1/\delta)/\epsilon^2$, for a universal
constant C, it holds that

$$\Pr\left[\,\left|\mu - \frac{1}{N} \sum_{i=1}^{N} X_i\right| < \epsilon\,\right] \ge 1 - \delta.$$

3 The List-Decoding Size of Fourier-Sparse Boolean Functions


We turn to prove Theorem 1.1, which provides an upper bound on the list-decoding size of the
code of block length $2^n$ of all k-Fourier-sparse Boolean functions on n variables. Equivalently, for
a general distance d and a function $f : \mathbb{F}_2^n \to \mathbb{R}$ we bound the number of k-Fourier-sparse Boolean
functions on n variables of distance at most d from f.

We start by proving that a function $f : \mathbb{F}_2^n \to \mathbb{R}$ with small spectral norm can be well
approximated by a linear combination of few functions from $\{\chi_S\}_{S \subseteq [n]}$ with coefficients of equal
magnitude. This was essentially proved in [9] and we include here the proof for completeness.


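Lemma 3.1. For every function $f : \mathbb{F}_2^n \to \mathbb{R}$, $\epsilon > 0$, and $\delta \in (0, 1/2]$, there exists a collection^2 $\mathcal{F}$ of
$O(\|\hat{f}\|_1^2 \cdot \log(1/\delta)/\epsilon^2)$ subsets of [n] with signs $(a_S \in \{\pm 1\})_{S \in \mathcal{F}}$ such that for all but at most a $\delta$ fraction
of $x \in \mathbb{F}_2^n$ it holds that

$$\left| f(x) - \frac{\|\hat{f}\|_1}{|\mathcal{F}|} \sum_{S \in \mathcal{F}} a_S \cdot \chi_S(x) \right| < \epsilon.$$

Proof: Observe that the function f can be represented as follows.

$$f = \sum_{S \subseteq [n]} \hat{f}(S) \cdot \chi_S
   = \sum_{S \subseteq [n]} \frac{|\hat{f}(S)|}{\|\hat{f}\|_1} \cdot \|\hat{f}\|_1 \cdot \mathrm{sign}(\hat{f}(S)) \cdot \chi_S
   = \mathbb{E}_{S \sim D}\left[\, \|\hat{f}\|_1 \cdot \mathrm{sign}(\hat{f}(S)) \cdot \chi_S \,\right],$$

where D is the distribution defined by $D(S) = |\hat{f}(S)|/\|\hat{f}\|_1$. Let $\mathcal{F}$ be a collection of
$|\mathcal{F}| = O(\|\hat{f}\|_1^2 \cdot \log(1/\delta)/\epsilon^2)$ independent random samples from the distribution D. For every $x \in \mathbb{F}_2^n$,
the Chernoff-Hoeffding bound (Theorem 2.5) implies that with probability at least $1 - \delta$ it holds
that

$$\left| f(x) - \frac{1}{|\mathcal{F}|} \sum_{S \in \mathcal{F}} \|\hat{f}\|_1 \cdot a_S \cdot \chi_S(x) \right| < \epsilon, \qquad (2)$$

where $a_S = \mathrm{sign}(\hat{f}(S))$. By linearity of expectation, it follows that there exist $\mathcal{F}$ and signs $(a_S)_{S \in \mathcal{F}}$
for which (2) holds for all but at most a $\delta$ fraction of $x \in \mathbb{F}_2^n$, as required. ■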
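The sampling step in this proof can be sketched in code. The following Python illustration is ours (all function names are hypothetical, and the sample count 2000 is an arbitrary choice rather than the constant from the Chernoff-Hoeffding bound): it draws sets S with probability proportional to $|\hat{f}(S)|$, averages the signed characters scaled by the spectral norm, and measures the fraction of points where the approximation error reaches a threshold. Subsets and inputs are encoded as bitmasks.

```python
import random

def chi(mask, x):
    """Character chi_S(x) = (-1)^{<S, x>}, with S and x encoded as bitmasks."""
    return -1 if bin(mask & x).count("1") % 2 else 1

def fourier_coefficients(table):
    """Fourier coefficients of a truth table indexed by 0..2^n-1."""
    size = len(table)
    return [sum(v * chi(s, x) for x, v in enumerate(table)) / size for s in range(size)]

def sample_approximation(table, num_samples, rng):
    """Draw S ~ D with D(S) = |f_hat(S)| / ||f_hat||_1 and average the signed characters."""
    coeffs = fourier_coefficients(table)
    spectral_norm = sum(abs(c) for c in coeffs)
    masks = rng.choices(range(len(table)), weights=[abs(c) for c in coeffs], k=num_samples)
    signs = [1 if coeffs[s] > 0 else -1 for s in masks]

    def approx(x):
        return spectral_norm / num_samples * sum(a * chi(s, x) for a, s in zip(signs, masks))

    return approx

n = 4
table = [int(x == 2 ** n - 1) for x in range(2 ** n)]  # AND of all n bits, spectral norm 1
approx = sample_approximation(table, num_samples=2000, rng=random.Random(0))
eps = 0.25
bad_fraction = sum(abs(table[x] - approx(x)) >= eps for x in range(2 ** n)) / 2 ** n
print(bad_fraction)
```

Only the list of sampled sets, their signs, and the spectral norm are needed to evaluate `approx`, which is what makes the representation succinct.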


We now apply Lemma 3.1 to Fourier-sparse functions in $\mathbb{F}_2^n \to \{-1,0,+1\}$ with bounded
support size, and then, in Corollary 3.3, derive an upper bound on the number of these functions.

Corollary 3.2. Let $f : \mathbb{F}_2^n \to \{-1,0,+1\}$ be a k-Fourier-sparse function satisfying $|\mathrm{supp}(f)| \le d$.
Then for every $\delta \in (0, 1/2]$ there exists a collection $\mathcal{F}$ of $O(dk \log(1/\delta)/2^n)$ subsets of [n] with signs
$(a_S \in \{\pm 1\})_{S \in \mathcal{F}}$ such that for all but at most a $\delta$ fraction of $x \in \mathbb{F}_2^n$ it holds that

$$\left| f(x) - \frac{\|\hat{f}\|_1}{|\mathcal{F}|} \sum_{S \in \mathcal{F}} a_S \cdot \chi_S(x) \right| < \frac{1}{2}.$$

^2 Repetitions of subsets in the collection $\mathcal{F}$ are allowed.

Proof: By the Cauchy-Schwarz inequality and Parseval's theorem, we obtain that

$$\|\hat{f}\|_1^2 \le k \cdot \sum_{S \subseteq [n]} \hat{f}(S)^2 = k \cdot 2^{-n} \cdot \sum_{x \in \mathbb{F}_2^n} f(x)^2 \le \frac{dk}{2^n}.$$

The corollary follows from Lemma 3.1 applied with $\epsilon = 1/2$, for $|\mathcal{F}| = O(\|\hat{f}\|_1^2 \log(1/\delta)/\epsilon^2) =
O(dk \log(1/\delta)/2^n)$. ■

Corollary 3.3. The number of k-Fourier-sparse functions $f : \mathbb{F}_2^n \to \{-1,0,+1\}$ satisfying $|\mathrm{supp}(f)| \le d$
is $2^{O(n \cdot dk \log k/2^n)}$.

Proof: For every k-Fourier-sparse function $f : \mathbb{F}_2^n \to \{-1,0,+1\}$ satisfying $|\mathrm{supp}(f)| \le d$, let $\mathcal{F}$
and $(a_S)_{S \in \mathcal{F}}$ be as given by Corollary 3.2 for, say, $\delta = 1/(5k)$. Since the range of f is $\{-1,0,+1\}$, it
follows that the collection $\mathcal{F}$, the signs $(a_S)_{S \in \mathcal{F}}$, and the value of $\|\hat{f}\|_1$ define a function of distance
at most $\delta \cdot 2^n$ from f. Notice that by Claim 2.2 and our choice of $\delta$, the distance between every
two distinct k-Fourier-sparse functions is larger than $2\delta \cdot 2^n$. Thus, a function of distance at most
$\delta \cdot 2^n$ from f fully defines f. This implies that f can be represented by a binary string of length
$O(n \cdot dk \log k/2^n)$, so the total number of such functions is $2^{O(n \cdot dk \log k/2^n)}$. ■

The bound in Corollary 3.3 implies a bound on the number of Fourier-sparse Boolean functions
of bounded distance from a given Boolean function.

Corollary 3.4. For every k-Fourier-sparse Boolean function $f : \mathbb{F}_2^n \to \{0,1\}$, the number of
k-Fourier-sparse Boolean functions of distance at most d from f is $2^{O(n \cdot dk \log k/2^n)}$.

Proof: Let $f : \mathbb{F}_2^n \to \{0,1\}$ be a k-Fourier-sparse Boolean function. Consider the mapping that
maps every k-Fourier-sparse Boolean function $g : \mathbb{F}_2^n \to \{0,1\}$, whose distance from f is at
most d, to the function $h = f - g$. Observe that h is a 2k-Fourier-sparse function from $\mathbb{F}_2^n$ to
$\{-1,0,+1\}$ satisfying $|\mathrm{supp}(h)| \le d$. By Corollary 3.3, the number of such functions h is bounded
by $2^{O(n \cdot dk \log k/2^n)}$. Since the above mapping is injective, this bound holds for the number of
functions g as well. ■

Equipped with Corollary 3.4, we restate and prove Theorem 1.1.

Theorem 1.1. For every function $f : \mathbb{F}_2^n \to \mathbb{R}$, the number of k-Fourier-sparse Boolean functions of
distance at most d from f is $2^{O(n \cdot dk \log k/2^n)}$.

Proof: If there is no k-Fourier-sparse Boolean function of distance at most d from f, then the bound
trivially holds. So assume that such a function $g : \mathbb{F}_2^n \to \{0,1\}$ exists. Observe that every
k-Fourier-sparse Boolean function of distance at most d from f has distance at most 2d from g. Thus,
by Corollary 3.4 applied to g, the number of such functions is at most $2^{O(n \cdot dk \log k/2^n)}$. ■

3.1 The Sample Complexity of Learning Fourier-Sparse Boolean Functions

The sample complexity of learning a class of functions is the minimum number of uniform and
independent random samples needed from a function in the class for specifying it with high success
probability. Here we consider the class of k-Fourier-sparse Boolean functions on n variables, and
show how Theorem 1.1 implies an upper bound on the sample complexity of learning it
(Corollary 1.2).

Theorem 3.5. For every n, $1 < k \le 2^n$, and a k-Fourier-sparse function $f : \mathbb{F}_2^n \to \mathbb{R}$, the following holds.
The probability that when sampling $O(n \cdot k \log k)$ uniform and independent random samples from f, there
exists a k-Fourier-sparse Boolean function $g \ne f$ that agrees with f on all the samples is $2^{-\Omega(n \log k)}$.

Proof: Consider $q = O(nk \log k)$ samples $(x, f(x))$ from a k-Fourier-sparse function $f : \mathbb{F}_2^n \to \mathbb{R}$,
where x is distributed uniformly and independently in $\mathbb{F}_2^n$. By Claim 2.2, the distance between
f and every other k-Fourier-sparse function is at least $2^n/(2k)$. For an integer $\ell \in [1, \lfloor \log_2 2k \rfloor]$,
consider all the k-Fourier-sparse Boolean functions whose distance from f is in $[2^{n-\ell}, 2^{n-\ell+1}]$. By
Theorem 1.1, the number of such functions is $2^{O(nk \log k/2^\ell)}$. The probability that such a function
agrees with q random independent samples of f is at most $(1 - 2^{-\ell})^q$. By the union bound, the
probability that at least one of these functions agrees with the q samples is at most

$$2^{O(nk \log k/2^\ell)} \cdot (1 - 2^{-\ell})^q \le 2^{-\Omega(n \log k)},$$

where the last inequality holds for an appropriate choice of $q = O(nk \log k)$. By applying the
union bound over all the values of $\ell$, it follows that with probability $1 - 2^{-\Omega(n \log k)}$ all the
k-Fourier-sparse Boolean functions (besides f) are eliminated, completing the proof. ■

The following corollary follows immediately from Theorem 3.5 and confirms Corollary 1.2.

Corollary 3.6. For every n and $1 < k \le 2^n$, the number of uniform and independent random samples
required for learning the class of k-Fourier-sparse Boolean functions on n variables with success probability
$1 - 2^{-\Omega(n \log k)}$ is $O(n \cdot k \log k)$.

We end with the following simple lower bound.

Theorem 3.7. For every n and $1 < k \le 2^n$, the number of uniform and independent random samples
required for learning the class of k-Fourier-sparse Boolean functions on n variables with constant success
probability is $\Omega(k \cdot (n - \log_2 k))$.

Proof: Assume without loss of generality that k is a power of 2. Let A be an algorithm for learning
the class above with constant success probability $p > 0$ using q uniform and independent random
samples. Consider the class $\mathcal{G}$ of indicators of affine subspaces of $\mathbb{F}_2^n$ of co-dimension $\log_2 k$ (i.e.,
affine subspaces of $\mathbb{F}_2^n$ of size $2^n/k$). By Claim 2.4, the functions in $\mathcal{G}$ are k-Fourier-sparse. Observe
that their number satisfies

$$|\mathcal{G}| = 2^{\Theta(n \cdot \min(\log_2 k,\ n - \log_2 k))}.$$

By Yao's minimax principle, there exists a deterministic algorithm A' (obtained by fixing the
random coins of A) that given evaluations of a function, chosen uniformly at random from $\mathcal{G}$, on a
fixed collection of q points in $\mathbb{F}_2^n$, learns it with success probability p.

Now, observe that the expected number of 1-evaluations that A' receives is $q/k$. By Markov's
inequality, the probability that A' receives at least $2q/(pk)$ 1-evaluations is at most $p/2$. It follows
that for at least a $p/2$ fraction of the functions in $\mathcal{G}$ the algorithm A' receives at most $2q/(pk)$
1-evaluations and learns them correctly. Assuming that $pk \ge 2$, the number of possible evaluation
sequences on these inputs is at most

$$\sum_{i=0}^{2q/(pk)} \binom{q}{i} \le \left(\frac{e \cdot pk}{2}\right)^{2q/(pk)} \le k^{O(q/k)},$$

where for the first inequality we used the standard inequality $\sum_{i=0}^{t} \binom{q}{i} \le (eq/t)^t$, which holds for
$t \le q$ (see, e.g., [16, Proposition 1.4]). The above is bounded from below by $|\mathcal{G}| \cdot p/2$, implying
that

$$q \ge \Omega(n \cdot \min(\log_2 k,\ n - \log_2 k) \cdot k/\log_2 k) \ge \Omega(k \cdot (n - \log_2 k)),$$

where the last inequality follows by considering separately the cases of $k \ge 2^{n/2}$ and $k < 2^{n/2}$. In
case that $pk < 2$, the number of possible evaluation sequences is at most $2^q$, and the bound follows
similarly using the assumption that p is a fixed constant. ■


4 Testing Booleanity of Fourier-Sparse Functions 

In this section we prove upper and lower bounds on the query complexity of testing Booleanity
of Fourier-sparse functions. For a parameter k, consider the problem in which given access to a
k-Fourier-sparse function $f : \mathbb{F}_2^n \to \mathbb{R}$ one has to decide if f is Boolean, i.e., $f(x) \in \{0,1\}$ for every
$x \in \mathbb{F}_2^n$, or not, with some constant success probability.

4.1 Upper Bound 

As mentioned before, Gur and Tamuz proved in [14] that every k-Fourier-sparse non-Boolean
function f on n variables satisfies $f(x) \notin \{0,1\}$ for at least $\Omega(2^n/k^2)$ inputs $x \in \mathbb{F}_2^n$ (see Claim 2.3).
Thus, querying the input function f on $O(k^2)$ independent and random inputs suffices in order to
catch a non-Boolean value of f if such a value exists. The following lemma shows that it is
not really necessary to choose the $O(k^2)$ random vectors independently. It turns out that a restriction
of a k-Fourier-sparse non-Boolean function to a random linear subspace of size $O(k^2)$, that is, of
dimension $\approx 2\log_2 k$, is with high probability non-Boolean. Thus, the tester could randomly pick
such a subspace and query f on all of its vectors. This decreases the amount of randomness used
in the tester of [14] from $O(nk^2)$ to $O(n \log k)$. More importantly for us, this reduces the problem
of testing Booleanity of k-Fourier-sparse functions on n variables to the case of $k = \Omega(2^{n/2})$.

Lemma 4.1. Let f -^2 R a k-Fourier-sparse non-Boolean function, and denote L = {k^ k2) / 2. 
Then, for every 5 > B, the restriction of f to a uniformly chosen random linear subspace of dimension 
^ non-Boolean with probability at least 1 — 5. 
Proof: Let $f : \mathbb{F}_2^n \to \mathbb{R}$ be a $k$-Fourier-sparse non-Boolean function. By Claim 2.3 there are at
least $2^n/L$ vectors $x \in \mathbb{F}_2^n$ for which $f(x) \notin \{0,1\}$. This implies that there exists a set $S$ of at
least $\log_2(2^n/L)$ linearly independent vectors in $\mathbb{F}_2^n$ on which $f$ is not Boolean. Consider a linear
subspace $V \subseteq \mathbb{F}_2^n$ of dimension $n - 1$ chosen uniformly at random. Since the vectors in $S$ are
linearly independent, the probability that no vector in $S$ is in $V$ is $2^{-|S|} \le L/2^n$. It follows that the
restriction $f|_V$ of $f$ to $V$ is a $k$-Fourier-sparse function defined on a linear subspace of dimension
$n - 1$, and its probability to be Boolean is at most $L/2^n$. Note that one can think of the domain
of $f|_V$ as $\mathbb{F}_2^{n-1}$, because $V$ and $\mathbb{F}_2^{n-1}$ are isomorphic and a composition with an invertible linear
transformation does not affect the Fourier-sparsity. Now, let us repeat the above process $n - r - 1$
additional times, until we get a linear subspace of dimension $r$. The probability that the function
becomes Boolean in one of the steps is at most

\[ \frac{L}{2^n} + \frac{L}{2^{n-1}} + \cdots + \frac{L}{2^{r+1}} \le \frac{L}{2^r} \le \delta, \]

and we are done. ■
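
The arithmetic in Lemma 4.1 can be checked numerically. The following is a minimal sketch, not part of the original argument: the function names `sparsity_bound`, `subspace_dimension`, and `union_bound` are hypothetical, and the values of $n$, $k$, and $\delta$ are arbitrary illustrative choices. It computes $L$, the dimension $r = \min(n, \lceil \log_2(L/\delta) \rceil)$, and the sum $L/2^n + \cdots + L/2^{r+1}$ in exact rational arithmetic.

```python
from fractions import Fraction

def sparsity_bound(k):
    """L = (k^2 + k + 2) / 2, as denoted in Lemma 4.1."""
    return Fraction(k * k + k + 2, 2)

def subspace_dimension(n, k, delta):
    """r = min(n, ceil(log2(L / delta))), found by exact doubling."""
    target = sparsity_bound(k) / Fraction(delta)
    r = 0
    while 2 ** r < target:
        r += 1
    return min(n, r)

def union_bound(n, r, k):
    """L/2^n + L/2^(n-1) + ... + L/2^(r+1), one term per halving step."""
    L = sparsity_bound(k)
    return sum(L / 2 ** j for j in range(r + 1, n + 1))

# Illustrative parameters (assumptions, not taken from the paper).
n, k, delta = 40, 10, Fraction(1, 100)
r = subspace_dimension(n, k, delta)
print(r, float(union_bound(n, r, k)), float(sparsity_bound(k) / 2 ** r))
```

When $r = n$ the sum is empty, which matches the fact that no restriction takes place in that case.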

We now restate and prove Theorem 1.3, which gives an upper bound of $O(k \cdot \log^2 k)$ on the
query complexity of testing Booleanity of $k$-Fourier-sparse functions. In the proof, we first apply
Lemma 4.1 to restrict the input function to a subspace of dimension $O(\log k)$. Then, we apply
Theorem 3.5 in an attempt to learn the restricted function and check if it is consistent with some
$k$-Fourier-sparse Boolean function.

Theorem 1.3. For every $k$ there exists a non-adaptive one-sided error tester that, using $O(k \cdot \log^2 k)$ queries
to an input $k$-Fourier-sparse function $f : \mathbb{F}_2^n \to \mathbb{R}$, decides if $f$ is Boolean or not with constant success
probability.

Proof: Consider the tester that, given access to an input $k$-Fourier-sparse function $f : \mathbb{F}_2^n \to \mathbb{R}$, acts
as follows:

1. Pick uniformly at random a linear subspace $V$ of $\mathbb{F}_2^n$ of dimension $r = \min(n, \lceil \log_2(100L) \rceil)$,
where $L = (k^2 + k + 2)/2$, and let $T$ be an invertible linear transformation mapping $\mathbb{F}_2^r$ to $V$.

2. Query $f$ on $O(r \cdot k \log k)$ random vectors chosen uniformly and independently from the subspace
$V$. Note that these queries can be seen as uniform and independent random samples
from the function $g : \mathbb{F}_2^r \to \mathbb{R}$ defined as $g = f \circ T$.

3. If there exists a $k$-Fourier-sparse Boolean function on $r$ variables that agrees with the above
samples of $g$ then accept, and otherwise reject.
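
Items 1 and 2 of the tester can be sketched in code as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: vectors of $\mathbb{F}_2^n$ are encoded as $n$-bit integers, the helper names (`gf2_rank`, `restriction_dimension`, `random_subspace_basis`, `apply_T`, `sample_g`) are hypothetical, and the number of samples is passed as a parameter because the constant hidden in $O(r \cdot k \log k)$ is not specified here. The consistency check of Item 3 is not implemented.

```python
import random

def gf2_rank(vectors):
    """Rank over F_2 of vectors encoded as ints (Gaussian elimination)."""
    pivots = {}  # leading bit position -> stored vector with that leading bit
    for v in vectors:
        while v:
            h = v.bit_length() - 1
            if h not in pivots:
                pivots[h] = v
                break
            v ^= pivots[h]
    return len(pivots)

def restriction_dimension(n, k):
    """r = min(n, ceil(log2(100 L))) with L = (k^2 + k + 2) / 2."""
    m = 50 * (k * k + k + 2)             # 100 * L, always an integer
    return min(n, (m - 1).bit_length())  # ceil(log2(m)) for m >= 1

def random_subspace_basis(n, r, rng):
    """Rejection-sample r linearly independent vectors; their span is a
    uniformly random r-dimensional linear subspace V."""
    while True:
        basis = [rng.getrandbits(n) for _ in range(r)]
        if gf2_rank(basis) == r:
            return basis

def apply_T(basis, y):
    """T(y) = sum of basis[i] over the set bits i of y, a bijection F_2^r -> V."""
    x = 0
    for i, b in enumerate(basis):
        if (y >> i) & 1:
            x ^= b
    return x

def sample_g(f, n, k, num_samples, rng):
    """Return r, a basis of V, and samples (y, g(y)) with g = f o T."""
    r = restriction_dimension(n, k)
    basis = random_subspace_basis(n, r, rng)
    samples = []
    for _ in range(num_samples):
        y = rng.getrandbits(r)
        samples.append((y, f(apply_T(basis, y))))
    return r, basis, samples

# Usage with an illustrative Boolean input: the XOR of coordinates 0 and 5.
f = lambda x: (x & 1) ^ ((x >> 5) & 1)
r, basis, samples = sample_g(f, n=20, k=3, num_samples=50, rng=random.Random(0))
```

Sampling a uniformly random ordered basis yields a uniformly random subspace because every $r$-dimensional subspace has the same number of ordered bases.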

We turn to prove the correctness of the above tester. If $f$ is a $k$-Fourier-sparse Boolean function
then so is $g$, because a restriction to a subspace and a composition with a linear transformation
leave the function $k$-Fourier-sparse and Boolean. Hence, in this case the tester accepts with
probability 1.

On the other hand, if $f$ is a $k$-Fourier-sparse non-Boolean function, then by Lemma 4.1 the
restriction of $f$ to the random subspace $V$ of dimension $r$ picked in Item 1, as well as the function $g$
defined in Item 2, are also non-Boolean with probability at least 0.99. In this case, by Theorem 3.5,
the probability that there is a $k$-Fourier-sparse Boolean function on $r$ variables that agrees with
$O(r \cdot k \log k)$ uniform and independent random samples from $g$ is small; thus the tester
correctly rejects with probability at least, say, 0.9, as required. Finally, observe that the number of
queries made by the tester is $O(r \cdot k \log k) = O(k \cdot \log^2 k)$. ■

4.2 Lower Bound 

We turn to restate and prove our lower bound on the query complexity of testing Booleanity of
$k$-Fourier-sparse functions.

Theorem 1.4. Every non-adaptive one-sided error tester for Booleanity of $k$-Fourier-sparse functions has
query complexity $\Omega(k \cdot \log k)$.

Proof: For a given integer $k$, let $n$ be the largest even integer satisfying $k \ge 3 \cdot 2^{n/2}$. Define a
distribution $D_{no}$ over functions in $\mathbb{F}_2^n \to \{0,1,2\}$ as follows. Pick uniformly at random a pair
$(V_1, V_2)$ of affine subspaces satisfying $\dim(V_1) = \dim(V_2) = n/2$ and $|V_1 \cap V_2| = 1$, and output
the sum of indicators $1_{V_1} + 1_{V_2}$. Notice that, by the claim on the Fourier-sparsity of indicators of
affine subspaces, such a function has Fourier-sparsity
at most $2 \cdot 2^{n/2} \le k$. Thus, a function chosen from $D_{no}$ is $k$-Fourier-sparse and non-Boolean with
probability 1.
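
To make the construction concrete, the following sketch builds one fixed (not random) member of the support of $D_{no}$ for $n = 4$ and counts its nonzero Fourier coefficients with a Walsh-Hadamard transform. The specific subspaces, the integer encoding of vectors, and the helper names `walsh_hadamard` and `linear_span` are illustrative assumptions.

```python
def walsh_hadamard(values):
    """Unnormalized Walsh-Hadamard transform of a list of length 2^n.
    Entry S equals 2^n times the Fourier coefficient at S."""
    a = list(values)
    h = 1
    while h < len(a):
        for i in range(0, len(a), 2 * h):
            for j in range(i, i + h):
                a[j], a[j + h] = a[j] + a[j + h], a[j] - a[j + h]
        h *= 2
    return a

def linear_span(generators):
    """All F_2-linear combinations of the generators (ints used as bit vectors)."""
    pts = {0}
    for v in generators:
        pts |= {p ^ v for p in pts}
    return pts

n = 4
V1 = linear_span([0b0001, 0b0010])   # dimension n/2 = 2
V2 = linear_span([0b0100, 0b1000])   # dimension n/2 = 2, meets V1 only at 0
f = [int(x in V1) + int(x in V2) for x in range(2 ** n)]

sparsity = sum(1 for c in walsh_hadamard(f) if c != 0)
print(max(f), sparsity, 2 * 2 ** (n // 2))
```

In this example $f$ takes the value 2 at the single common point, so it is non-Boolean, and its sparsity stays within the bound $2 \cdot 2^{n/2}$.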

Let $T$ be a non-adaptive one-sided error randomized tester for Booleanity of $k$-Fourier-sparse
functions with query complexity $q$ and success probability at least 2/3. By Yao's minimax
principle, there exists a deterministic tester $T'$ (obtained by fixing the random coins of $T$) that rejects
a random function chosen from $D_{no}$ with probability at least 2/3. Since $T$ is non-adaptive and
has one-sided error, it follows that $T'$ queries an input function on $q$ fixed vectors $a_1, \ldots, a_q \in \mathbb{F}_2^n$,
accepts every $k$-Fourier-sparse Boolean function, and rejects a function chosen from $D_{no}$ with
probability at least 2/3. We turn to prove that $q \ge (n \cdot 2^{n/2})/1000 = \Omega(k \cdot \log k)$.

Assume for contradiction that $q < (n \cdot 2^{n/2})/1000$. Let $f$ be a random function chosen from
$D_{no}$, that is, $f = 1_{V_1} + 1_{V_2}$ for random affine subspaces $V_1$ and $V_2$ of dimension $n/2$ satisfying
$|V_1 \cap V_2| = 1$. For $i = 1, 2$, let $W_i$ be the affine span of $\{a_1, \ldots, a_q\} \cap V_i$. Let $E$ be the event that the
intersection of $W_1$ and $W_2$ is empty. We turn to prove that if the event $E$ happens then the tester
$T'$ accepts the function $f$, and that the probability of this event is at least 0.9. This contradicts the
success probability of $T'$ on functions chosen from $D_{no}$ and completes the proof.

Lemma 4.2. If the event E happens then the tester T' accepts the function f. 

Proof: Assume that the event $E$ happens, i.e., $W_1 \cap W_2 = \emptyset$. Then, there exists an affine subspace
$V_2'$ of dimension $n/2 - 1$ satisfying $W_2 \subseteq V_2' \subseteq V_2$ and $V_1 \cap V_2' = \emptyset$. Consider the function
$g = 1_{V_1} + 1_{V_2'}$. By the claim on the Fourier-sparsity of indicators of affine subspaces, $g$ is a
Boolean function whose Fourier-sparsity is at most $3 \cdot 2^{n/2} \le k$,
thus it is accepted by $T'$. However, $g$ satisfies $g(a_i) = f(a_i)$ for every $1 \le i \le q$. This implies that
$T'$ cannot distinguish between $g$ and $f$, so it must accept $f$ as well. ■
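
The indistinguishability argument can be illustrated on a small hand-picked instance. In the sketch below, the subspaces, the query set, and the helper names `affine_span` and `sparsity` are all assumptions chosen for illustration with $n = 4$; vectors are encoded as integers. Here $W_2$ already has dimension $n/2 - 1$, so it serves as $V_2'$ directly.

```python
def affine_span(points):
    """Affine span over F_2 of a nonempty collection of points (ints)."""
    points = list(points)
    base = points[0]
    directions = {0}
    for p in points[1:]:
        d = p ^ base
        if d not in directions:
            directions |= {s ^ d for s in directions}
    return {s ^ base for s in directions}

def sparsity(values):
    """Number of nonzero Fourier coefficients (unnormalized transform)."""
    a = list(values)
    h = 1
    while h < len(a):
        for i in range(0, len(a), 2 * h):
            for j in range(i, i + h):
                a[j], a[j + h] = a[j] + a[j + h], a[j] - a[j + h]
        h *= 2
    return sum(1 for c in a if c != 0)

n = 4
V1 = affine_span([0, 1, 2])      # {0, 1, 2, 3}
V2 = affine_span([0, 4, 8])      # {0, 4, 8, 12}, meets V1 only at 0
queries = [1, 3, 4, 12, 5]       # hypothetical fixed queries a_1, ..., a_q

W1 = affine_span([a for a in queries if a in V1])
W2 = affine_span([a for a in queries if a in V2])
V2_prime = W2                    # dimension n/2 - 1, inside V2, disjoint from V1

f = [int(x in V1) + int(x in V2) for x in range(2 ** n)]
g = [int(x in V1) + int(x in V2_prime) for x in range(2 ** n)]

print(W1 & W2, [f[a] == g[a] for a in queries], max(g), sparsity(g))
```

On these queries $f$ and $g$ return identical answers, while $g$ is Boolean and its sparsity is within $3 \cdot 2^{n/2}$, so a one-sided tester that sees only these answers has to accept.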

Lemma 4.3. The probability of the event E is at least 0.9. 
Proof: Denote by $X$ the number of vectors in $\{a_1, \ldots, a_q\} \cap V_1$. Since $V_1$ is distributed uniformly
over all affine subspaces of dimension $n/2$, the probability that $a_i$ belongs to $V_1$ is $2^{-n/2}$ for every
$1 \le i \le q$. Thus, by linearity of expectation,

\[ \mathbb{E}[X] = \frac{q}{2^{n/2}} < \frac{(n \cdot 2^{n/2})/1000}{2^{n/2}} = \frac{n}{1000}. \]

By Markov's inequality, we obtain that

\[ \Pr\left[\dim(W_1) \ge \frac{n}{10}\right] \le \Pr\left[X \ge \frac{n}{10}\right] \le 0.01. \]

Now, fix a choice of $V_1$ for which $\dim(W_1) < n/10$, and consider the randomness over the
choice of $V_2$. Notice that, conditioned on $V_1$, $V_2$ is distributed uniformly over all the affine
subspaces of dimension $n/2$ which contain exactly one vector from $V_1$. By symmetry, every vector
of $V_1$ has probability $|V_1|^{-1} = 2^{-n/2}$ to belong to $V_2$. Thus, the probability that the vector that
belongs to both $V_1$ and $V_2$ is in $W_1$ is $|W_1| \cdot 2^{-n/2} \le 2^{n/10} \cdot 2^{-n/2} = 2^{-2n/5}$.

Finally, the probability that $W_1 \cap W_2 = \emptyset$ is at least the probability that $W_1 \cap V_2 = \emptyset$, and the
latter is at least $1 - (0.01 + 2^{-2n/5}) \ge 0.9$ for every sufficiently large $n$. ■
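
The constants in this proof can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names `markov_bound` and `event_lower_bound` are hypothetical, and floating-point arithmetic is used for convenience. It evaluates the Markov bound for a query count just below $(n \cdot 2^{n/2})/1000$ and finds the smallest even $n$ for which $1 - (0.01 + 2^{-2n/5}) \ge 0.9$.

```python
def markov_bound(n, q):
    """Markov upper bound on Pr[X >= n/10], using E[X] = q / 2^(n/2)."""
    expected = q / 2 ** (n / 2)
    return expected / (n / 10)

def event_lower_bound(n):
    """Lower bound 1 - (0.01 + 2^(-2n/5)) on the probability of the event E."""
    return 1 - (0.01 + 2 ** (-2 * n / 5))

n = 20
q = (n * 2 ** (n // 2)) // 1000   # largest integer below n * 2^(n/2) / 1000 here
print(q, markov_bound(n, q))

smallest_even_n = next(m for m in range(2, 100, 2) if event_lower_bound(m) >= 0.9)
print(smallest_even_n, event_lower_bound(smallest_even_n))
```

Under this bound, "sufficiently large" already holds for fairly small even $n$.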